Abstract

Iterated hairpin completion is an operation on formal languages that is 
inspired by the hairpin formation in DNA biochemistry. Iterated hairpin 
completion of a word (or more precisely a singleton language) is always 
a context-sensitive language and for some words it is known to be non-
context-free. However, it is unknown whether regularity of the iterated
hairpin completion of a given word is decidable. The question whether the
iterated hairpin completion of a word can be context-free but not regular
has also been asked in the literature. In this paper we investigate iterated hairpin
completions of non-crossing words and, within this setting, we are able 
to answer both questions. For non-crossing words we prove that the reg- 
ularity of iterated hairpin completions is decidable and that if iterated 
hairpin completion of a non-crossing word is not regular, then it is not 
context-free either. 

1 Introduction 

On an abstract level, a DNA single strand can be viewed as a word over the four- 
letter alphabet {A, C, G, T} where the letters represent the nucleobases adenine, 
cytosine, guanine, and thymine, respectively. The Watson-Crick complement of 
A is T and the complement of C is G. Two complementary single strands of 
opposite orientation can bond to each other and form a DNA double strand. 
Throughout the paper, we use the bar-notation for complementary strands of 
opposite orientation. 

In the same manner, a single strand can bond to itself if two of its substrands 
are complementary and do not overlap with each other. Such an intramolecular 
base pairing is called a hairpin. We are especially interested in hairpins of single
strands of the form $\sigma = \gamma\alpha\beta\bar{\alpha}$. Here, the substrand $\bar{\alpha}$ can bond to the
substrand $\alpha$. Then, by extension, a new single strand can be synthesized which
we call a hairpin completion of $\sigma$, see Figure 1. In this situation we call the
substrands that initiate the hairpin completion, $\alpha$ and $\bar{\alpha}$, primers.

In DNA computing hairpins and hairpin completions are often undesired
by-products. Therefore, sets of strands have been analyzed and designed that
are unlikely to form hairpins or lead to other undesirable hybridization, see
[1, 2, 7, 8, 10, 18] and the references within.

Figure 1: Hairpin completion of a DNA single strand

However, there are DNA computational models that rely on hairpins, e.g., 
DNA RAM [9,20,21] and Whiplash PCR [5,19,22]. For the Whiplash PCR
consider a single strand just like in Figure 1, but where the length of extension 
is controlled by stopper sequences. Repeating this operation, DNA can be used 
to solve combinatorial problems like the Hamiltonian path problem. 

Inspired by hairpins in biocomputing, the hairpin completion of a formal
language has been introduced by Cheptea, Martin-Vide, and Mitrana in [3]. In
several papers hairpin completion and its iterated variant have been
investigated, see [4,12-17]. In this paper we consider iterated hairpin completions of
singletons, that is, informally speaking, iterated hairpin completions of words.
The class of iterated hairpin completions of singletons is denoted by HCS. It
is known that every language in HCS is decidable in NL (non-deterministic,
logarithmic space) as NL is closed under iterated hairpin completion [3]; hence,
HCS is a proper subclass of the context-sensitive languages. It is also known
that HCS contains regular as well as non-context-free languages [12]. In the
latter paper, two open problems have been stated:

1. Is it decidable whether the iterated hairpin completion of a singleton is
regular?

2. Does a singleton exist whose iterated hairpin completion is context-free
but not regular?

We solve both questions for non-crossing words (or rather, singletons
containing a non-crossing word). A word $w$ is said to be non-crossing if, for a
given primer $\alpha$, the right-most occurrence of the factor $\alpha$ in $w$ precedes the
left-most occurrence of the factor $\bar{\alpha}$ in $w$, see Section 3. We provide a necessary
and sufficient condition for regularity of the iterated hairpin completion of a given
non-crossing word (Theorem 4.6 and Corollary 4.9) and, since this condition is
decidable, we answer the first question positively (Corollary 4.10). Furthermore,
we show that the iterated hairpin completion of a non-crossing word is either
regular or not context-free (Corollary 4.11). Thus, we give a negative answer
to the second question.

This paper is the continuation of the studies in [11].

2 Preliminaries



We assume the reader to be familiar with the fundamental concepts of language 
theory, see [6]. 

Let $\Sigma$ be an alphabet, $\Sigma^*$ be the set of all words over $\Sigma$, and for an integer
$k \ge 0$, $\Sigma^k$ be the set of all words of length $k$ over $\Sigma$. The word of length $0$ is
called the empty word, denoted by $\varepsilon$, and we let $\Sigma^+ = \Sigma^* \setminus \{\varepsilon\}$. A subset of
$\Sigma^*$ is called a language over $\Sigma$. For a word $w \in \Sigma^*$, we employ the notation $w$
when we mean the word as well as the singleton language $\{w\}$ unless confusion
arises.

We equip $\Sigma$ with a function $\bar{\ } \colon \Sigma \to \Sigma$ satisfying $\bar{\bar{a}} = a$ for all $a \in \Sigma$; such a
function is called an involution. This involution is naturally extended to
words as: for $a_1, \ldots, a_n \in \Sigma$, $\overline{a_1 a_2 \cdots a_n} = \bar{a}_n \cdots \bar{a}_2 \bar{a}_1$. For a word $w \in \Sigma^*$,
we call $\bar{w}$ the complement of $w$, being inspired by this application. A word
$w \in \Sigma^*$ is called a pseudo-palindrome if $w = \bar{w}$. For a language $L \subseteq \Sigma^*$, we let
$\bar{L} = \{\bar{w} \mid w \in L\}$.
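
As an illustration of the extended involution, the following minimal sketch computes complements over the DNA alphabet from the introduction. The names `COMPLEMENT`, `complement`, and `is_pseudo_palindrome` are hypothetical helpers introduced only for this example; the paper itself works with an arbitrary alphabet and involution.

```python
# Watson-Crick involution on letters (assumption: DNA alphabet {A, C, G, T}).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(word: str) -> str:
    """Extend the involution to words: reverse the word, complement each letter."""
    return "".join(COMPLEMENT[a] for a in reversed(word))


def is_pseudo_palindrome(word: str) -> bool:
    """A word is a pseudo-palindrome if it equals its own complement."""
    return word == complement(word)


if __name__ == "__main__":
    print(complement("AAC"))             # GTT
    print(is_pseudo_palindrome("ACGT"))  # True
```

Applying `complement` twice returns the original word, which mirrors the involution property on letters.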

For words $u, w \in \Sigma^*$, if $w = xuy$ holds for some words $x, y \in \Sigma^*$, then $u$ is
called a factor of $w$; a factor that is distinct from $w$ is said to be proper. If the
equation holds with $x = \varepsilon$ ($y = \varepsilon$), then the factor $u$ is especially called a prefix
(resp. a suffix) of $w$. The prefix relation can be regarded as a partial order $\le_p$
over $\Sigma^*$ whereas the proper prefix relation can be regarded as a strict order $<_p$
over $\Sigma^*$; $u \le_p w$ means that $u$ is a prefix of $w$ and $u <_p w$ means that $u$ is a
proper prefix of $w$. Analogously, by $w \ge_s u$ (or $w >_s u$) we mean that $u$ is a
suffix (resp. proper suffix) of $w$. Note that $u \le_p w$ if and only if $\bar{w} \ge_s \bar{u}$. For a
word $w \in \Sigma^*$ and a language $L \subseteq \Sigma^*$, a factor $u$ of $w$ is minimal with respect to
$L$ if $u \in L$ and none of the proper factors of $u$ is in $L$.

2.1 Hairpin Completion 

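Let $k$ be a constant that is assumed to be the length of a primer and let
$\alpha \in \Sigma^k$ be a primer. If a given word $w \in \Sigma^*$ can be written as $\gamma\alpha\beta\bar{\alpha}$ for some
$\gamma, \beta \in \Sigma^*$, then its right hairpin completion (with respect to $\alpha$) results in the
word $\gamma\alpha\beta\bar{\alpha}\bar{\gamma}$. By $w \to_{\mathcal{RH}_\alpha} z$ we mean that $z$ can be obtained from $w$ by right
hairpin completion (with respect to $\alpha$). The left hairpin completion is defined
analogously as an operation to derive $\bar{\gamma}\alpha\beta\bar{\alpha}\gamma$ from $\alpha\beta\bar{\alpha}\gamma$, and the relation $\to_{\mathcal{LH}_\alpha}$
is naturally introduced. We write $w \to_{\mathcal{H}_\alpha} z$ if $w \to_{\mathcal{RH}_\alpha} z$ or $w \to_{\mathcal{LH}_\alpha} z$. By
$\to^*_{\mathcal{LH}_\alpha}$, $\to^*_{\mathcal{RH}_\alpha}$, and $\to^*_{\mathcal{H}_\alpha}$ we denote the reflexive and transitive closure of $\to_{\mathcal{LH}_\alpha}$,
$\to_{\mathcal{RH}_\alpha}$, and $\to_{\mathcal{H}_\alpha}$, respectively. Whenever $\alpha$ is clear from the context, we omit
the subscript $\alpha$ and write $\to_{\mathcal{RH}}$, $\to_{\mathcal{LH}}$, or $\to_{\mathcal{H}}$, respectively.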

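For a language $L \subseteq \Sigma^*$, we define the set of words obtained by hairpin
completion from $L$, and the set of words obtained by iterated hairpin completion
from $L$, respectively, as follows:

$\mathcal{H}_\alpha(L) = \{z \mid \exists w \in L,\ w \to_{\mathcal{H}_\alpha} z\}$,  $\mathcal{H}^*_\alpha(L) = \{z \mid \exists w \in L,\ w \to^*_{\mathcal{H}_\alpha} z\}$.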
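
The definitions above can be sketched in code as follows. This is only an illustrative sketch over the DNA alphabet; the function names (`right_completions`, `left_completions`, `hairpin_completions`, `iterated_completions`) and the length bound `max_len` are assumptions made for this example, since the iterated completion is in general an infinite language and can only be enumerated up to a bound.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(word):
    """Reverse the word and complement each letter."""
    return "".join(COMPLEMENT[a] for a in reversed(word))


def right_completions(w, alpha):
    """All z with w = gamma alpha beta abar and z = w + complement(gamma)."""
    k, abar = len(alpha), complement(alpha)
    if not w.endswith(abar):
        return set()
    # alpha must start at |gamma| and must not overlap the suffix abar.
    return {w + complement(w[:p])
            for p in range(len(w) - 2 * k + 1)
            if w.startswith(alpha, p)}


def left_completions(w, alpha):
    """All z with w = alpha beta abar gamma and z = complement(gamma) + w."""
    k, abar = len(alpha), complement(alpha)
    if not w.startswith(alpha):
        return set()
    # abar must start at position >= k so that it does not overlap the prefix alpha.
    return {complement(w[p + k:]) + w
            for p in range(k, len(w) - k + 1)
            if w.startswith(abar, p)}


def hairpin_completions(w, alpha):
    """One step of hairpin completion (right or left) with respect to alpha."""
    return right_completions(w, alpha) | left_completions(w, alpha)


def iterated_completions(w, alpha, max_len):
    """Words of length <= max_len reachable from w by iterated hairpin completion."""
    seen, frontier = {w}, [w]
    while frontier:
        new = []
        for x in frontier:
            for z in hairpin_completions(x, alpha):
                if len(z) <= max_len and z not in seen:
                    seen.add(z)
                    new.append(z)
        frontier = new
    return seen


if __name__ == "__main__":
    w = "ACACTGTGT"  # alpha = AC, complement of alpha = GT
    print(sorted(hairpin_completions(w, "AC")))
```

Note that choosing the empty word for $\gamma$ yields $w$ itself, so $w$ belongs to its own one-step completion whenever it has the required form.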

In this paper the hairpin completion is always considered with respect to 
a fixed primer a. However, in other literature the hairpin completion is often 
considered with respect to the length k of primers instead of a specific primer 
and defined as



3 Non-crossing Words and Their Properties 

In this section, we describe some structural properties of non-crossing words and
their iterated hairpin completions and we introduce the notions of $\alpha$-prefixes,
$\bar{\alpha}$-suffixes, and $\alpha$-indices.

For a word $\alpha$, we say that $w$ is non-$\alpha$-crossing if the rightmost occurrence of
$\alpha$ precedes the leftmost occurrence of $\bar{\alpha}$ on $w$ (yet these factors may overlap).
If $\alpha$ is understood from the context, we simply say that $w$ is non-crossing.
Otherwise, the word is $\alpha$-crossing or crossing. The definition of a word $w$ being
non-$\alpha$-crossing becomes useful in our work only if $w \in \alpha\Sigma^*$ or $w \in \Sigma^*\bar{\alpha}$, and
therefore, $\alpha$ and $\bar{\alpha}$ are primers; actually, we will assume both. The main purpose
of this paper is to prove a necessary and sufficient condition for the regularity of
the iterated hairpin completion $\mathcal{H}^*_\alpha(w)$, where $w \in \alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$ is non-$\alpha$-crossing.

Note that if $w \in \alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$ and $\alpha = \bar{\alpha}$, then either $w = \alpha$ and $\mathcal{H}^*_\alpha(w) = \{w\}$
or $w$ can be considered crossing. Thus, whenever we consider non-crossing
words, we assume that $\alpha \ne \bar{\alpha}$.
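
The non-crossing property is easy to test mechanically. The sketch below is an illustration over the DNA alphabet; `occurrences` and `is_non_crossing` are hypothetical helper names, and the function assumes (as the text does) that both $\alpha$ and $\bar{\alpha}$ actually occur in $w$, returning `False` otherwise.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(word):
    """Reverse the word and complement each letter."""
    return "".join(COMPLEMENT[a] for a in reversed(word))


def occurrences(w, factor):
    """Start positions of all (possibly overlapping) occurrences of factor in w."""
    return [i for i in range(len(w) - len(factor) + 1) if w.startswith(factor, i)]


def is_non_crossing(w, alpha):
    """True if the rightmost alpha starts before the leftmost complement of alpha."""
    pos_alpha = occurrences(w, alpha)
    pos_abar = occurrences(w, complement(alpha))
    if not pos_alpha or not pos_abar:
        return False  # assumption: both primers must occur in w
    return max(pos_alpha) < min(pos_abar)


if __name__ == "__main__":
    print(is_non_crossing("ACACTGTGT", "AC"))  # True
    print(is_non_crossing("ACGTACGT", "AC"))   # False
```

In the second word an occurrence of `GT` at position 2 lies to the left of the occurrence of `AC` at position 4, so the word is crossing.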

Any word obtained from a non-crossing word by hairpin completion is non- 
crossing. Though being easily confirmed, this closure property forms the foun- 
dation of our discussions in this paper. 

Proposition 3.1. For a non-crossing word $w \in \alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$, every word in
$\mathcal{H}^*_\alpha(w)$ is non-crossing.

Let us provide another characterization for a word $w \in \alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$ to be
non-crossing. With Proposition 3.1, this characterization will bring a unique
factorization of any word $z$ in $\mathcal{H}^*_\alpha(w)$ as $z = xwy$ for some words $x, y$
(Corollary 3.3).

Proposition 3.2. A word $w \in \alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$ is non-crossing if and only if it
contains exactly one factor $x$ which is minimal with respect to $\alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$.

Proof. The word $w$ contains at least one factor from $\alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$ and hence it
contains at least one minimal factor, too.

Suppose $w$ contains exactly one minimal factor $x$ from $\alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$. If the
prefix $\alpha$ of $x$ was followed in $w$ by another factor $\alpha$, then this factor is again
followed by $\bar{\alpha}$ and we would find a second minimal factor $x' \in \alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$.
Hence, the prefix $\alpha$ of $x$ is the rightmost occurrence of $\alpha$ in $w$ and, symmetrically,
the suffix $\bar{\alpha}$ of $x$ is the leftmost occurrence of $\bar{\alpha}$ in $w$. As the rightmost occurrence
of $\alpha$ precedes the leftmost occurrence of $\bar{\alpha}$, the word $w$ is non-crossing.

Now suppose $w$ contains at least two minimal factors $x_1, x_2$ from $\alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$.
If these factors overlap and the overlapped part is of length at least $k$, we may
assume $x_1 = y_1 y_2$ and $x_2 = y_2 y_3$ for words $y_1, y_2, y_3 \in \Sigma^+$ where $y_2$, the
overlapping part, has length at least $k$. We see that $y_2 \in \alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$ and that $x_1, x_2$ were
not minimal with respect to $\alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$. Therefore, $x_1, x_2$ overlap in less than
$k$ letters (or do not overlap at all) and $w$ can be considered crossing. □

Corollary 3.3. Let $w \in \alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$ be non-crossing. The factor $w$ occurs
exactly once in every word from $\mathcal{H}^*_\alpha(w)$.
3.1 $\alpha$-Prefixes and $\bar{\alpha}$-Suffixes

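Let $u, v, w$ be words. We call $u$ an $\alpha$-prefix of $w$ if $u\alpha \le_p w$. This means, if $\bar{\alpha}$
is a suffix of $w$, then the suffix can bond to the factor $\alpha$ which directly follows
the prefix $u$ (unless they overlap with each other) and $w\bar{u}$ can be obtained from
$w$ by right hairpin completion. By $\mathcal{P}_\alpha(w)$, we denote the set of all $\alpha$-prefixes of
$w$. Note that if $x, y \in \mathcal{P}_\alpha(w)$ and $|x| \le |y|$, then $x\alpha \le_p y\alpha$ and $x \in \mathcal{P}_\alpha(y\alpha)$.
Symmetrically, we call $v$ an $\bar{\alpha}$-suffix of $w$ if $w \ge_s \bar{\alpha}v$, and $\mathcal{S}_{\bar{\alpha}}(w)$ is the set of all
$\bar{\alpha}$-suffixes of $w$. If $\alpha \le_p w$ and $|w| \ge |v| + 2k$, then $w \to_{\mathcal{LH}} \bar{v}w$. Note that
$\overline{\mathcal{S}_{\bar{\alpha}}(w)} = \mathcal{P}_\alpha(\bar{w})$. Therefore, all results we derive for $\alpha$-prefixes also hold for the
complements of $\bar{\alpha}$-suffixes.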

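When $m = |\mathcal{P}_\alpha(w)|$ and $n = |\mathcal{S}_{\bar{\alpha}}(w)|$ for a word $w$, then $w$ is called an
$(m,n)$-$\alpha$-word (or simply an $(m,n)$-word). Throughout the paper, it will be convenient
to let $\mathcal{P}_\alpha(w) = \{u_0, \ldots, u_{m-1}\}$ and $\mathcal{S}_{\bar{\alpha}}(w) = \{v_0, \ldots, v_{n-1}\}$ where the words
are ordered such that $u_0 <_p \cdots <_p u_{m-1}$ and $v_{n-1} >_s \cdots >_s v_0$. Note that
$\mathcal{P}_\alpha(u_i\alpha) = \{u_0, \ldots, u_i\}$ for $0 \le i < m$ and $\mathcal{S}_{\bar{\alpha}}(\bar{\alpha}v_j) = \{v_0, \ldots, v_j\}$ for $0 \le j < n$.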
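
The sets of $\alpha$-prefixes and $\bar{\alpha}$-suffixes can be listed directly from the occurrences of the two primers. The following sketch (DNA alphabet; the helper names `alpha_prefixes` and `abar_suffixes` are assumptions for this illustration) returns both sets ordered by length, which matches the ordering by the prefix and suffix relations used in the text.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(word):
    """Reverse the word and complement each letter."""
    return "".join(COMPLEMENT[a] for a in reversed(word))


def alpha_prefixes(w, alpha):
    """u_0, ..., u_{m-1}: every u such that u + alpha is a prefix of w."""
    return [w[:p] for p in range(len(w) - len(alpha) + 1) if w.startswith(alpha, p)]


def abar_suffixes(w, alpha):
    """v_0, ..., v_{n-1}: every v such that complement(alpha) + v is a suffix of w."""
    k, abar = len(alpha), complement(alpha)
    suffixes = [w[p + k:] for p in range(len(w) - k + 1) if w.startswith(abar, p)]
    return sorted(suffixes, key=len)


if __name__ == "__main__":
    w = "ACACTGTGT"
    P, S = alpha_prefixes(w, "AC"), abar_suffixes(w, "AC")
    print(P, S, (len(P), len(S)))  # w is a (2,2)-word
```

On this example the identity between the complemented $\bar{\alpha}$-suffixes of $w$ and the $\alpha$-prefixes of $\bar{w}$ can be checked by comparing `[complement(v) for v in S]` with `alpha_prefixes(complement(w), "AC")`.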

Let us begin with a basic observation. 

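Lemma 3.4. For a word $w \in \alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$ the following statements hold:

1. For every $x \in \mathcal{P}_\alpha(w) \cup \overline{\mathcal{S}_{\bar{\alpha}}(w)}$, we have $\alpha \le_p x\alpha$.

2. For every $x_1, \ldots, x_\ell \in \mathcal{P}_\alpha(w) \cup \overline{\mathcal{S}_{\bar{\alpha}}(w)}$, we have $\alpha \le_p x_1 \cdots x_\ell\alpha$.

Proof. If $x \in \mathcal{P}_\alpha(w)$, the first statement derives directly from the definition. If
$x \in \overline{\mathcal{S}_{\bar{\alpha}}(w)}$, then $\bar{\alpha}$ is a suffix of $\bar{\alpha}\bar{x}$, whence $\alpha \le_p x\alpha$.

The second statement follows by induction on $\ell$ and the first statement. □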

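Consider $w \in \alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$. Note that this means $u_0 = v_0 = \varepsilon$. It is easy to
see that every word $z$ which belongs to $\mathcal{H}_\alpha(w)$ has a factorization $z = w\bar{u}$ for
some $u \in \mathcal{P}_\alpha(w)$ or $z = \bar{v}w$ for some $v \in \mathcal{S}_{\bar{\alpha}}(w)$. By the previous lemma we see
that $z \in \alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$ and by induction $\mathcal{H}^*_\alpha(w) \subseteq \alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$.

The next lemma tells us that if $w \in \alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$ is a non-crossing $(m,n)$-word with
$n \ge 2$, then the suffix $\bar{\alpha}$ does not overlap with any of the factors $\alpha$ and, therefore,
$w \to_{\mathcal{RH}} w\bar{u}$ for all $u \in \mathcal{P}_\alpha(w)$.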

Lemma 3.5. Let $w \in \alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$ be a non-crossing $(m,n)$-word with $n \ge 2$ and
let $u_{m-1}$ be the longest $\alpha$-prefix in $\mathcal{P}_\alpha(w)$. Then $|u_{m-1}| + 2k \le |w|$ holds.

Proof. Let $\mathcal{P}_\alpha(w) = \{u_0, \ldots, u_{m-1}\}$ such that $u_0 <_p \cdots <_p u_{m-1}$ and $\mathcal{S}_{\bar{\alpha}}(w) =
\{v_0, \ldots, v_{n-1}\}$ such that $v_{n-1} >_s \cdots >_s v_0$ as usual. Suppose that the inequality
$|u_{m-1}| + |v_{n-2}| + 2k \le |w|$ did not hold (note that this inequality is stronger
than the one proposed in our claim). Being non-crossing, $w$ can be written as
$w = u_{m-1} x v_{n-2}$ for some word $x \in \alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$ with $|x| < 2k$. Hence $x = \bar{x}$.
Let $y$ be the nonempty word satisfying $v_{n-1} = y v_{n-2}$. Since $w$ is non-crossing,
$x v_{n-2} \ge_s \bar{\alpha} v_{n-1}$ must hold, from which we have $x \ge_s \bar{\alpha}y$. Combining this with
$x = \bar{x}$ enables us to find an $\alpha$-prefix $u_{m-1}\bar{y}$ of $w$, but this would be longer than
the longest $\alpha$-prefix of $w$, a contradiction. □

Since the analogous argument is valid for left hairpin completion, Lemma 3.5
leads us to one important corollary on non-crossing $(m,n)$-words for $m, n \ge 2$.

Corollary 3.6. Let $w \in \alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$ be a non-crossing $(m,n)$-word with $m, n \ge 2$. Then

$\mathcal{H}_\alpha(w) = \{w\} \cup w\,\overline{\mathcal{P}_\alpha(w)} \cup \overline{\mathcal{S}_{\bar{\alpha}}(w)}\,w$.

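Next, we consider the case when there is a prefix $u \in \mathcal{P}_\alpha(w)$ and a suffix
$v \in \mathcal{S}_{\bar{\alpha}}(w)$ such that $\bar{v} \in \mathcal{P}_\alpha(u\alpha)^*$ and $u \in \overline{\mathcal{S}_{\bar{\alpha}}(\bar{\alpha}v)}^*$.

Lemma 3.7. Let $w \in \alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$ be an $(m,n)$-word. For $u \in \mathcal{P}_\alpha(w)$ and
$v \in \mathcal{S}_{\bar{\alpha}}(w)$,

1. if $\bar{v} \in \mathcal{P}_\alpha(u\alpha)^*$, then $\overline{\mathcal{S}_{\bar{\alpha}}(\bar{\alpha}v)} \subseteq \mathcal{P}_\alpha(u\alpha)^*$.

2. if $\bar{v} \in \mathcal{P}_\alpha(u\alpha)^*$ and $u \in \overline{\mathcal{S}_{\bar{\alpha}}(\bar{\alpha}v)}^*$, then $\mathcal{P}_\alpha(u\alpha)^* = \overline{\mathcal{S}_{\bar{\alpha}}(\bar{\alpha}v)}^*$.

Proof. Let $\mathcal{P}_\alpha(w) = \{u_0, \ldots, u_{m-1}\}$ such that $u_0 <_p \cdots <_p u_{m-1}$ and $\mathcal{S}_{\bar{\alpha}}(w) =
\{v_0, \ldots, v_{n-1}\}$ such that $v_{n-1} >_s \cdots >_s v_0$. We may assume $u = u_s$ for some
$0 \le s < m$ and $v = v_t$ for some $0 \le t < n$. Recall that $\mathcal{P}_\alpha(u\alpha) = \{u_0, \ldots, u_s\}$.

For the first statement, let $\bar{v} = x_1 \cdots x_\ell$ where $x_1, \ldots, x_\ell \in \{u_0, \ldots, u_s\}$. For
$0 \le j \le t$ we have $\bar{v}_j\alpha \le_p \bar{v}\alpha$, hence, there is $1 \le i \le \ell$ such that $\bar{v}_j = x_1 \cdots x_{i-1}y$
where $y <_p x_i$ and $y\alpha \le_p x_i \cdots x_\ell\alpha$. By Lemma 3.4, we see that $y\alpha \le_p x_i\alpha \le_p w$
and hence $y \in \{u_0, \ldots, u_s\}$. Therefore, $\bar{v}_j \in \{u_0, \ldots, u_s\}^*$.

The second statement follows immediately by the symmetry between prefixes
and complemented suffixes. □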

Lemma 4.3 and Corollary 4.4 in Section 4 will describe the consequences
for the iterated hairpin completion of a non-crossing word $w$, if we find such a
situation.

3.2 The a-Index 

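The $\alpha$-index of a word $x$ is the number of occurrences of the factor $\alpha$ in the
word $x\alpha$ except for the suffix. Formally, we define a function $\mathrm{ind}_\alpha \colon \Sigma^* \to \mathbb{N}$ as

$\mathrm{ind}_\alpha(x) = |\mathcal{P}_\alpha(x\alpha)| - 1$.

Recall that $\mathcal{P}_\alpha(w) = \{u_0, \ldots, u_{m-1}\}$ such that $u_0 <_p \cdots <_p u_{m-1}$. Note
that, by this ordering, for all $0 \le i < m$ we have $\mathrm{ind}_\alpha(u_i) = i$ and if $x \in \mathcal{P}_\alpha(w)$,
then $x = u_{\mathrm{ind}_\alpha(x)}$. Symmetrically, if $x \in \mathcal{S}_{\bar{\alpha}}(w)$, then $x = v_{\mathrm{ind}_\alpha(\bar{x})}$.

Also note that for words $x, y$ with $\mathrm{ind}_\alpha(x) > \mathrm{ind}_\alpha(y)$ the word $x$ cannot
be a factor of $y$ as the positions of the factors $\alpha$ cannot match. Later we will
consider the $\alpha$-indices of words from $\alpha\Sigma^*\alpha^{-1}$ (note that $x \in \alpha\Sigma^*\alpha^{-1}$ if and only
if $\alpha \le_p x\alpha$). If $x \in \alpha\Sigma^*\alpha^{-1}$ and $y \in \Sigma^*$, then $\mathrm{ind}_\alpha(yx) = \mathrm{ind}_\alpha(y) + \mathrm{ind}_\alpha(x)$.
These observations lead to the following properties.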

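Lemma 3.8. Let $\mathcal{P}_\alpha(w) = \{u_0, \ldots, u_{m-1}\}$ such that $u_0 <_p \cdots <_p u_{m-1}$, let
$x \in \alpha\Sigma^*\alpha^{-1}$, and let $0 \le j < m$. If $x$ is a suffix of $u_j$, then $u_j = u_{j-\mathrm{ind}_\alpha(x)}\,x$.
In particular, if $w \in \alpha\Sigma^*$ and $u_j \ge_s u_i$ for $0 \le i \le j < m$, then $u_j = u_{j-i}u_i$.

Proof. Let $y$ be such that $u_j = yx$. As $\alpha \le_p x\alpha$, we have $y\alpha \le_p u_j\alpha \le_p w$,
and hence, $y \in \mathcal{P}_\alpha(w)$. As $\mathrm{ind}_\alpha(y) = \mathrm{ind}_\alpha(u_j) - \mathrm{ind}_\alpha(x)$, it follows that
$y = u_{j-\mathrm{ind}_\alpha(x)}$. □
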
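Lemma 3.9. Let $\mathcal{P}_\alpha(w) = \{u_0, \ldots, u_{m-1}\}$ such that $u_0 <_p \cdots <_p u_{m-1}$, let
$1 \le i < m$, and let $x$ be a word with $\mathrm{ind}_\alpha(x) \le i$. Then

$x \in \{u_1, \ldots, u_i\}^* \iff x \in \{u_1, \ldots, u_{\mathrm{ind}_\alpha(x)}\}^*$.

Proof. The implication from right to left is plain.

Conversely, let $x = y_1 \cdots y_\ell$ with $y_1, \ldots, y_\ell \in \{u_1, \ldots, u_i\}$. As $\mathrm{ind}_\alpha(x) \ge
\mathrm{ind}_\alpha(y_j)$ for all $1 \le j \le \ell$, we see that, actually, $y_1, \ldots, y_\ell \in \{u_1, \ldots, u_{\mathrm{ind}_\alpha(x)}\}$.
Hence, $x \in \{u_1, \ldots, u_{\mathrm{ind}_\alpha(x)}\}^*$, as desired. □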
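
The $\alpha$-index of Section 3.2 can be computed by counting occurrences. The sketch below is only an illustration; `alpha_index` is a hypothetical helper name, and the sample words are chosen over the DNA alphabet with $\alpha$ = `AC`. It also checks the additivity $\mathrm{ind}_\alpha(yx) = \mathrm{ind}_\alpha(y) + \mathrm{ind}_\alpha(x)$ on one instance where $x\alpha$ starts with $\alpha$.

```python
def alpha_index(x, alpha):
    """Number of occurrences of alpha in x + alpha, not counting the final suffix."""
    xa = x + alpha
    # Start positions 0..len(x) cover every occurrence, including the suffix itself.
    return sum(xa.startswith(alpha, p) for p in range(len(x) + 1)) - 1


if __name__ == "__main__":
    alpha = "AC"
    x, y = "AC", "TTAC"  # x + alpha starts with alpha, so additivity applies
    print(alpha_index(x, alpha), alpha_index(y, alpha), alpha_index(y + x, alpha))
```

For an $\alpha$-prefix $u_i$ of a word $w$ starting with $\alpha$, the value `alpha_index(u_i, alpha)` is the position $i$ of that prefix in the ordered list of $\alpha$-prefixes.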

4 Iterated Hairpin Completion of Non-crossing 
Words 

Now we are prepared to prove a necessary and sufficient condition for the
regularity of $\mathcal{H}^*_\alpha(w)$ where $w \in \alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$ is a non-crossing $(m,n)$-word with
$\mathcal{P}_\alpha(w) = \{u_0, \ldots, u_{m-1}\}$ and $\mathcal{S}_{\bar{\alpha}}(w) = \{v_0, \ldots, v_{n-1}\}$ which are ordered as in
the previous section. (Keep in mind that $u_0 = v_0 = \varepsilon$.) By a result from [11] it
is enough to consider the case where $m, n \ge 2$ and in this case, by Corollary 3.6,

$\mathcal{H}_\alpha(w) = \{w\} \cup w\{\bar{u}_1, \ldots, \bar{u}_{m-1}\} \cup \{\bar{v}_1, \ldots, \bar{v}_{n-1}\}w$.

Theorem 4.1 (See [11]). If $w \in \alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$ is a non-crossing $(m,n)$-word with
$m = 1$ or $n = 1$, then $\mathcal{H}^*_\alpha(w)$ is regular.

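The next two lemmas lead to a first sufficient condition for the regularity of
$\mathcal{H}^*_\alpha(w)$, see Corollary 4.4.

Lemma 4.2. For non-crossing $w \in \alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$,

$\mathcal{H}^*_\alpha(w) \subseteq \left(\mathcal{P}_\alpha(w) \cup \overline{\mathcal{S}_{\bar{\alpha}}(w)}\right)^* w \left(\overline{\mathcal{P}_\alpha(w)} \cup \mathcal{S}_{\bar{\alpha}}(w)\right)^*$.

Proof. For every word $z \in \mathcal{H}^*_\alpha(w)$, by definition, we find a series of words
$w = w_0, w_1, \ldots, w_\ell = z$ for some $\ell \ge 0$ such that

$w_0 \to_{\mathcal{H}} w_1 \to_{\mathcal{H}} \cdots \to_{\mathcal{H}} w_\ell$.

We prove $z = w_\ell \in (\mathcal{P}_\alpha(w) \cup \overline{\mathcal{S}_{\bar{\alpha}}(w)})^* w (\overline{\mathcal{P}_\alpha(w)} \cup \mathcal{S}_{\bar{\alpha}}(w))^*$ by induction on $\ell$. For
$\ell = 0$ this is plain. Now we assume that any word which can be derived from $w$
by at most $\ell - 1$ hairpin completions is in this set. By induction hypothesis,

$w_{\ell-1} = x_s \cdots x_1 w \bar{y}_1 \cdots \bar{y}_t$

for some $s, t \ge 0$ and $x_1, \ldots, x_s, y_1, \ldots, y_t \in \mathcal{P}_\alpha(w) \cup \overline{\mathcal{S}_{\bar{\alpha}}(w)}$. Hence, $\mathcal{P}_\alpha(w_{\ell-1}) \subseteq
(\mathcal{P}_\alpha(w) \cup \overline{\mathcal{S}_{\bar{\alpha}}(w)})^*$ and $\mathcal{S}_{\bar{\alpha}}(w_{\ell-1}) \subseteq (\overline{\mathcal{P}_\alpha(w)} \cup \mathcal{S}_{\bar{\alpha}}(w))^*$. This proves our claim. □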

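Lemma 4.3. Let $w \in \alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$ be non-crossing. If for some $u \in \mathcal{P}_\alpha(w)$ and
$v \in \mathcal{S}_{\bar{\alpha}}(w)$ we have $\bar{v} \in \mathcal{P}_\alpha(u\alpha)^*$ and $u \in \overline{\mathcal{S}_{\bar{\alpha}}(\bar{\alpha}v)}^*$, then

$\mathcal{P}_\alpha(u\alpha)^*\, w\, \overline{\mathcal{P}_\alpha(u\alpha)}^* = \overline{\mathcal{S}_{\bar{\alpha}}(\bar{\alpha}v)}^*\, w\, \mathcal{S}_{\bar{\alpha}}(\bar{\alpha}v)^* \subseteq \mathcal{H}^*_\alpha(w)$.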
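Proof. Let $\mathcal{P}_\alpha(w) = \{u_0, \ldots, u_{m-1}\}$ such that $u_0 <_p \cdots <_p u_{m-1}$ and $\mathcal{S}_{\bar{\alpha}}(w) =
\{v_0, \ldots, v_{n-1}\}$ such that $v_{n-1} >_s \cdots >_s v_0$.

First we prove that for the special case when $u = \bar{v}$, $\mathcal{P}_\alpha(u\alpha)^* w \overline{\mathcal{P}_\alpha(u\alpha)}^* \subseteq
\mathcal{H}^*_\alpha(w)$. Afterwards, we will use this fact in order to prove our claim.

Suppose $u = \bar{v} \in \mathcal{P}_\alpha(w) \cap \overline{\mathcal{S}_{\bar{\alpha}}(w)}$ and let $\ell = \mathrm{ind}_\alpha(u)$. This implies $u_i = \bar{v}_i$
for all $0 \le i \le \ell$. Let

$z = x_s \cdots x_1 w \bar{y}_1 \cdots \bar{y}_t$

where $s, t \ge 0$ and $x_1, \ldots, x_s, y_1, \ldots, y_t \in \{u_0, \ldots, u_\ell\} = \mathcal{P}_\alpha(u\alpha)$. Let $s'$ and $t'$
be maximal such that $\mathrm{ind}_\alpha(x_{s'}) = \ell$ and $\mathrm{ind}_\alpha(y_{t'}) = \ell$, or $0$ if no such index
exists. We let $w' = x_{s'} \cdots x_1 w \bar{y}_1 \cdots \bar{y}_{t'}$. Note that $\bar{\alpha}\bar{u}_\ell$ is a suffix of $w'$ and
hence

$w \to^*_{\mathcal{RH}} w\bar{y}_1 \cdots \bar{y}_{t'} \to^*_{\mathcal{LH}} x_{s'} \cdots x_1 w \bar{y}_1 \cdots \bar{y}_{t'} = w'$.

We have $w' \in \mathcal{H}^*_\alpha(w)$ and $z \in \{u_0, \ldots, u_{\ell-1}\}^*\, w'\, \overline{\{u_0, \ldots, u_{\ell-1}\}}^*$. We may
continue inductively and conclude $z \in \mathcal{H}^*_\alpha(w)$.

Now let $\bar{v} \in \mathcal{P}_\alpha(u\alpha)^*$ and $u \in \overline{\mathcal{S}_{\bar{\alpha}}(\bar{\alpha}v)}^*$ as in the claim and let $\ell \le \mathrm{ind}_\alpha(u)$
be the minimal index such that $\bar{v} \in \{u_0, \ldots, u_\ell\}^*$. Note that

$\mathcal{P}_\alpha(u\alpha)^* = \overline{\mathcal{S}_{\bar{\alpha}}(\bar{\alpha}v)}^* = \{u_0, \ldots, u_\ell\}^*$,

by Lemma 3.7. The minimality of $\ell$ yields $u_\ell \notin \{u_0, \ldots, u_{\ell-1}\}^*$ and $u_\ell$ is a
factor of $\bar{v}$, which means that $\ell \le \mathrm{ind}_\alpha(\bar{v}) < n$. We claim that $u_\ell = \bar{v}_\ell$. Indeed,
if we had $u_\ell \ne \bar{v}_\ell$, then $u_\ell \in \overline{\{v_0, \ldots, v_{\ell-1}\}}^* \subseteq \{u_0, \ldots, u_{\ell-1}\}^*$ (see Lemma 3.9).

As $u_\ell = \bar{v}_\ell$, our observations made for the special case apply and we may
conclude

$\mathcal{P}_\alpha(u_\ell\alpha)^* w \overline{\mathcal{P}_\alpha(u_\ell\alpha)}^* = \mathcal{P}_\alpha(u\alpha)^* w \overline{\mathcal{P}_\alpha(u\alpha)}^* = \overline{\mathcal{S}_{\bar{\alpha}}(\bar{\alpha}v)}^* w\, \mathcal{S}_{\bar{\alpha}}(\bar{\alpha}v)^* \subseteq \mathcal{H}^*_\alpha(w)$.

□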

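Suppose that the longest $\alpha$-prefix $u_{m-1}$ belongs to $\overline{\mathcal{S}_{\bar{\alpha}}(w)}^*$ and the longest
$\bar{\alpha}$-suffix $v_{n-1}$ belongs to $\overline{\mathcal{P}_\alpha(w)}^*$. By Lemma 3.7, we see that $\mathcal{P}_\alpha(w)^* = \overline{\mathcal{S}_{\bar{\alpha}}(w)}^*$
and, by Lemmas 4.2 and 4.3, we infer $\mathcal{H}^*_\alpha(w) = \mathcal{P}_\alpha(w)^*\, w\, \overline{\mathcal{P}_\alpha(w)}^*$. (Note that
$\mathcal{P}_\alpha(w) = \mathcal{P}_\alpha(u_{m-1}\alpha)$.)

Corollary 4.4. Let $w \in \alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$ be non-crossing, let $u_{m-1}$ be the longest
$\alpha$-prefix of $w$, and let $v_{n-1}$ be the longest $\bar{\alpha}$-suffix of $w$. If $u_{m-1} \in \overline{\mathcal{S}_{\bar{\alpha}}(w)}^*$ and
$v_{n-1} \in \overline{\mathcal{P}_\alpha(w)}^*$, then $\mathcal{H}^*_\alpha(w)$ is regular.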
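
The regular expression behind Corollary 4.4 can be sanity-checked on a small instance by bounded enumeration. The sketch below is an illustration only, not part of the proof: it uses the DNA alphabet, the sample word `ACACTGTGT` with $\alpha$ = `AC` (for which the $\alpha$-prefixes are the empty word and `AC`, and the $\bar{\alpha}$-suffixes are the empty word and `GT`), and hypothetical helper names. It compares all iterated completions up to a length bound with the words of the form `(AC)^i w (GT)^j` up to the same bound.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(word):
    """Reverse the word and complement each letter."""
    return "".join(COMPLEMENT[a] for a in reversed(word))


def hairpin_completions(w, alpha):
    """One step of right or left hairpin completion with respect to alpha."""
    k, abar, out = len(alpha), complement(alpha), set()
    if w.endswith(abar):    # right completion: w = gamma alpha beta abar
        out |= {w + complement(w[:p])
                for p in range(len(w) - 2 * k + 1) if w.startswith(alpha, p)}
    if w.startswith(alpha):  # left completion: w = alpha beta abar gamma
        out |= {complement(w[p + k:]) + w
                for p in range(k, len(w) - k + 1) if w.startswith(abar, p)}
    return out


def iterated_completions(w, alpha, max_len):
    """Words of length <= max_len reachable from w by iterated completion."""
    seen, frontier = {w}, [w]
    while frontier:
        new = []
        for x in frontier:
            for z in hairpin_completions(x, alpha) - seen:
                if len(z) <= max_len:
                    seen.add(z)
                    new.append(z)
        frontier = new
    return seen


def expected_language(w, max_len):
    """Words (AC)^i w (GT)^j of length <= max_len (the claimed form for this w)."""
    slack = (max_len - len(w)) // 2
    return {"AC" * i + w + "GT" * j
            for i in range(slack + 1) for j in range(slack + 1 - i)}


if __name__ == "__main__":
    w = "ACACTGTGT"
    print(iterated_completions(w, "AC", 15) == expected_language(w, 15))
```

Agreement up to a bound is of course only evidence for this one word; the corollary itself rests on Lemmas 4.2 and 4.3.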

Our next result, Theorem 4.6, shows that if we can state a necessary and
sufficient condition for the special case where $u_1 = \bar{v}_1$, then we can extend this
condition to the general case. We need a preliminary lemma in order to prove
Theorem 4.6.

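Lemma 4.5. Let $w \in \alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$ be a non-crossing $(m,n)$-word with $m, n \ge 2$,
let $\mathcal{P}_\alpha(w) = \{u_0, \ldots, u_{m-1}\}$ such that $u_0 <_p \cdots <_p u_{m-1}$, and let $\mathcal{S}_{\bar{\alpha}}(w) =
\{v_0, \ldots, v_{n-1}\}$ such that $v_{n-1} >_s \cdots >_s v_0$.

1. If $u_1 = \bar{v}_1$, then $\mathcal{H}^*_\alpha(w) \subseteq u_1\alpha\Sigma^* \cap \Sigma^*\bar{\alpha}\bar{u}_1$.

2. If $u_1 \ne \bar{v}_1$, then $\mathcal{H}^*_\alpha(w\bar{u}_i) \cap \mathcal{H}^*_\alpha(\bar{v}_j w) = \emptyset$ for all $1 \le i < m$ and $1 \le j < n$.

3. Let $1 \le i < j < m$. If $u_j \ge_s u_i$, then $\mathcal{H}^*_\alpha(w\bar{u}_j) \subseteq \mathcal{H}^*_\alpha(w\bar{u}_i)$; otherwise,
$\mathcal{H}^*_\alpha(w\bar{u}_i) \cap \mathcal{H}^*_\alpha(w\bar{u}_j) = \emptyset$.
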
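Proof. Consider a word $z$ with prefix $u_1\alpha$ and suffix $\bar{\alpha}\bar{u}_1$. It is easy to see that
any word which is a hairpin completion of $z$ has prefix $u_1\alpha$ and suffix $\bar{\alpha}\bar{u}_1$, as
well. Statement 1 follows by induction.

Note that $\bar{v}_1\alpha$ is not a proper prefix of $u_1\alpha$; otherwise, $\bar{v}_1$ would be an $\alpha$-prefix
of $w$ which is longer than $u_0 = \varepsilon$ but shorter than $u_1$. Symmetrically, $u_1\alpha$ is not a
proper prefix of $\bar{v}_1\alpha$. By the first statement, we have $\mathcal{H}^*_\alpha(w\bar{u}_i) \subseteq u_1\alpha\Sigma^* \cap \Sigma^*\bar{\alpha}\bar{u}_1$
and $\mathcal{H}^*_\alpha(\bar{v}_j w) \subseteq \bar{v}_1\alpha\Sigma^* \cap \Sigma^*\bar{\alpha}v_1$ for all $1 \le i < m$ and $1 \le j < n$. Therefore,
if $u_1 \ne \bar{v}_1$, the intersection $\mathcal{H}^*_\alpha(w\bar{u}_i) \cap \mathcal{H}^*_\alpha(\bar{v}_j w)$ has to be empty which proves
statement 2.

For statement 3, assume first $u_j \ge_s u_i$. By Lemma 3.8 we obtain $u_j = u_{j-i}u_i$
and hence $w\bar{u}_j = w\bar{u}_i\bar{u}_{j-i} \in \mathcal{H}^*_\alpha(w\bar{u}_i)$. We conclude $\mathcal{H}^*_\alpha(w\bar{u}_j) \subseteq \mathcal{H}^*_\alpha(w\bar{u}_i)$.

Now assume $u_i$ is not a suffix of $u_j$. As $w\bar{u}_i$ (resp. $w\bar{u}_j$) is a factor of every
word in $\mathcal{H}^*_\alpha(w\bar{u}_i)$ (resp. $\mathcal{H}^*_\alpha(w\bar{u}_j)$) and there is only one factor $w$ in every word
from $\mathcal{H}^*_\alpha(w)$, it is plain that the intersection $\mathcal{H}^*_\alpha(w\bar{u}_i) \cap \mathcal{H}^*_\alpha(w\bar{u}_j)$ is empty (see
Corollary 3.3). □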

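Let us define the index sets

$I = \{i \mid 1 \le i < m \wedge u_i \notin \{u_1, \ldots, u_{i-1}\}^*\}$,
$J = \{j \mid 1 \le j < n \wedge v_j \notin \{v_1, \ldots, v_{j-1}\}^*\}$.

Thus, for all $i \in I$, no nonempty proper suffix of $u_i$ belongs to $\mathcal{P}_\alpha(w)$ and for all $j \in J$, no
nonempty proper prefix of $v_j$ belongs to $\mathcal{S}_{\bar{\alpha}}(w)$, see Lemma 3.8. By the previous lemma,
if $u_1 \ne \bar{v}_1$, then $\mathcal{H}^*_\alpha(w)$ is the disjoint union

$\mathcal{H}^*_\alpha(w) = \{w\} \cup \bigcup_{i \in I} \mathcal{H}^*_\alpha(w\bar{u}_i) \cup \bigcup_{j \in J} \mathcal{H}^*_\alpha(\bar{v}_j w)$.

Note that for every word $w\bar{u}_i$ with $i \in I$ or $\bar{v}_j w$ with $j \in J$, the shortest
$\alpha$-prefix is complementary to the shortest $\bar{\alpha}$-suffix. This observation leads us to
an important theorem that allows us to reduce the general case to the special
case where $u_1 = \bar{v}_1$.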

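Theorem 4.6. Let $w \in \alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$ be a non-crossing $(m,n)$-word with $m, n \ge 2$,
let $\mathcal{P}_\alpha(w) = \{u_0, \ldots, u_{m-1}\}$ such that $u_0 <_p \cdots <_p u_{m-1}$, let $\mathcal{S}_{\bar{\alpha}}(w) =
\{v_0, \ldots, v_{n-1}\}$ such that $v_{n-1} >_s \cdots >_s v_0$, and define $I, J$ as above.

For $u_1 \ne \bar{v}_1$, $\mathcal{H}^*_\alpha(w)$ is regular if and only if $\mathcal{H}^*_\alpha(w\bar{u}_i)$ is regular for all $i \in I$
and $\mathcal{H}^*_\alpha(\bar{v}_j w)$ is regular for all $j \in J$.

Proof. As we already stated,

$\mathcal{H}^*_\alpha(w) = \{w\} \cup \bigcup_{i \in I} \mathcal{H}^*_\alpha(w\bar{u}_i) \cup \bigcup_{j \in J} \mathcal{H}^*_\alpha(\bar{v}_j w)$.

If every language in the union is regular, then the union itself is regular.
Conversely, assume $\mathcal{H}^*_\alpha(w\bar{u}_i)$ is not regular for some $i \in I$, then the intersection

$\mathcal{H}^*_\alpha(w) \cap (u_1\alpha\Sigma^* \cap \Sigma^*\bar{\alpha}\bar{u}_1) \cap \Sigma^* w\bar{u}_i \Sigma^* = \mathcal{H}^*_\alpha(w\bar{u}_i)$

is not regular and hence $\mathcal{H}^*_\alpha(w)$ is not regular either, by Lemma 4.5. The
argument for non-regular $\mathcal{H}^*_\alpha(\bar{v}_j w)$ with $j \in J$ is analogous. □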
Theorem 4.6 justifies the assumption $u_1 = \bar{v}_1$ that we make from now on.
The next two theorems prove a necessary and sufficient condition for the
regularity of $\mathcal{H}^*_\alpha(w)$. We start by proving that the condition is sufficient.

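Theorem 4.7. Let $w \in \alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$ be a non-crossing $(m,n)$-word with $m, n \ge 2$,
let $\mathcal{P}_\alpha(w) = \{u_0, \ldots, u_{m-1}\}$ such that $u_0 <_p \cdots <_p u_{m-1}$, and let $\mathcal{S}_{\bar{\alpha}}(w) =
\{v_0, \ldots, v_{n-1}\}$ such that $v_{n-1} >_s \cdots >_s v_0$.

$\mathcal{H}^*_\alpha(w)$ is regular if both of the following conditions hold:

1. for all $1 \le s < m$, either $u_s \in \overline{\mathcal{S}_{\bar{\alpha}}(w)}^*$ or $\overline{\mathcal{S}_{\bar{\alpha}}(w)}^* \subseteq \{u_1, \ldots, u_s\}^*$, and

2. for all $1 \le t < n$, either $\bar{v}_t \in \mathcal{P}_\alpha(w)^*$ or $\mathcal{P}_\alpha(w) \subseteq \{\bar{v}_1, \ldots, \bar{v}_t\}^*$.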

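Proof. Assume that both conditions 1 and 2 are satisfied. We may assume
that there is $1 \le s < m$ such that $u_s \notin \overline{\mathcal{S}_{\bar{\alpha}}(w)}^*$ or there is $1 \le t < n$ such
that $\bar{v}_t \notin \mathcal{P}_\alpha(w)^*$; otherwise Corollary 4.4 yields the regularity. In addition,
we cannot assume the existence of both such $s$ and $t$ as $u_s \notin \overline{\mathcal{S}_{\bar{\alpha}}(w)}^*$ implies
$\overline{\mathcal{S}_{\bar{\alpha}}(w)}^* \subseteq \{u_1, \ldots, u_s\}^*$ due to condition 1. The symmetry in the roles
of conditions 1 and 2 enables us to assume that such $s$ exists without loss of
generality, and moreover, we can assume that for all $1 \le i < s$, $u_i \in \overline{\mathcal{S}_{\bar{\alpha}}(w)}^*$ and
for all $s \le i < m$, $u_i \notin \overline{\mathcal{S}_{\bar{\alpha}}(w)}^*$ by Lemma 3.7.

Let $R' = \overline{\mathcal{S}_{\bar{\alpha}}(w)}^*\, w\, \overline{\{u_1, \ldots, u_{s-1}\}}^*$ and for $s \le i < m$, let

$R_i = \{u_1, \ldots, u_i\}^*\, w\, \overline{\{u_1, \ldots, u_i\}}^*\, \bar{u}_i\, \overline{\{u_1, \ldots, u_i\}}^*$.

Then we let $R = \bigcup_{s \le i < m} R_i \cup R'$ and we claim $\mathcal{H}^*_\alpha(w) = R$.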

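Firstly, we prove that $R \subseteq \mathcal{H}^*_\alpha(w)$. Since $m, n \ge 2$, Corollary 3.6 can be
used to see that $w\,\overline{\{u_1, \ldots, u_i\}}^*\,\bar{u}_i \subseteq \mathcal{H}^*_\alpha(w)$ for $s \le i < m$, and by Lemma 4.3,

$R_i = \{u_1, \ldots, u_i\}^*\, w\, \overline{\{u_1, \ldots, u_i\}}^*\, \bar{u}_i\, \overline{\{u_1, \ldots, u_i\}}^* \subseteq \mathcal{H}^*_\alpha(w)$.

Consider $z \in R'$. We may factorize $z = x_i \cdots x_1 w \bar{y}_1 \cdots \bar{y}_j$ where $x_1, \ldots, x_i \in
\overline{\mathcal{S}_{\bar{\alpha}}(w)}$ and $y_1, \ldots, y_j \in \{u_1, \ldots, u_{s-1}\}$. Let $1 \le t < n$ be minimal such that
$\bar{v}_t \notin \{u_1, \ldots, u_{s-1}\}^*$. As $\bar{v}_t \in \{u_1, \ldots, u_s\}^*$, we see that $u_s$ is a factor of $\bar{v}_t$
and $t \ge s$. By Lemma 3.9, $u_{s-1} \in \overline{\{v_1, \ldots, v_{t-1}\}}^*$ and, by the minimality of
$t$, $\bar{v}_{t-1} \in \{u_1, \ldots, u_{s-1}\}^*$. If $\mathrm{ind}_\alpha(x_\ell) < t$ for all $1 \le \ell \le i$, then, due to
Lemma 4.3,

$z \in \overline{\{v_1, \ldots, v_{t-1}\}}^*\, w\, \{v_1, \ldots, v_{t-1}\}^* \subseteq \mathcal{H}^*_\alpha(w)$.

Otherwise, let $1 \le \ell \le i$ be maximal such that $\mathrm{ind}_\alpha(x_\ell) \ge t$ and let $w' =
x_\ell \cdots x_1 w$. Observe that $w \to^*_{\mathcal{LH}} w'$ and, again by Lemma 4.3,

$z \in \overline{\{v_1, \ldots, v_{t-1}\}}^*\, w'\, \{v_1, \ldots, v_{t-1}\}^* \subseteq \mathcal{H}^*_\alpha(w') \subseteq \mathcal{H}^*_\alpha(w)$.

Thus, $R \subseteq \mathcal{H}^*_\alpha(w)$.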

Now we prove the opposite inclusion by induction on the length of a
derivation to generate a word in $\mathcal{H}^*_\alpha(w)$ from $w$. Clearly, $w \in R$ (base case). As
an induction hypothesis, we assume that any word which can be derived from
$w$ by $\ell - 1$ hairpin completions is in $R$, and consider $z \in \mathcal{H}^*_\alpha(w)$ whose
shortest derivation from $w$ by hairpin completions is of length $\ell$. Let $w'$ be the
word that precedes $z$ on this shortest derivation, that is, $w' \in \mathcal{H}^{\ell-1}_\alpha(w)$; hence,
$w' \in R$ by the induction hypothesis. Therefore, $w'$ must be either in $R_i$ for
some $s \le i < m$ or in $R'$. Let us consider the first case first. If $z$ is obtained
from $w'$ by right hairpin completion, then the complement of the extended part
is in $\{u_1, \ldots, u_i\}^*\{\varepsilon, u_1, \ldots, u_{m-1}\}$, and hence, $z \in R_j$ for some $i \le j < m$.
Otherwise ($w' \to_{\mathcal{LH}} z$), $z \in R_i$ because $\overline{\mathcal{S}_{\bar{\alpha}}(w')} \subseteq \{u_1, \ldots, u_s\}^* \subseteq \{u_1, \ldots, u_i\}^*$.
Next we consider the case when $w' \in R'$. Since $\{u_1, \ldots, u_{s-1}\}^* \subseteq \overline{\mathcal{S}_{\bar{\alpha}}(w)}^*$ it
is easy to see that if $w' \to_{\mathcal{LH}} z$ then $z \in R'$ as well. Otherwise ($w' \to_{\mathcal{RH}} z$),
$z \in w'\,\overline{\{\varepsilon, u_1, \ldots, u_{m-1}\}}\,\mathcal{S}_{\bar{\alpha}}(w)^*$ and, as $\mathcal{S}_{\bar{\alpha}}(w)^* \subseteq \overline{\{u_1, \ldots, u_s\}}^*$, if $z \notin R'$, this
word is covered by some language $R_i$ where $s \le i < m$.

Consequently, $\mathcal{H}^*_\alpha(w) = R$ is regular. □

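Theorem 4.8. Let $w \in \alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$ be a non-crossing $(m,n)$-word with $m, n \ge 2$,
let $\mathcal{P}_\alpha(w) = \{u_0, \ldots, u_{m-1}\}$ such that $u_0 <_p \cdots <_p u_{m-1}$, let $\mathcal{S}_{\bar{\alpha}}(w) =
\{v_0, \ldots, v_{n-1}\}$ such that $v_{n-1} >_s \cdots >_s v_0$, and let $u_1 = \bar{v}_1$.

1. $\mathcal{H}^*_\alpha(w)$ is not regular if there are $1 \le s < m$ and $1 \le t < n$ such that
$u_s \notin \overline{\{v_1, \ldots, v_{n-1}\}}^*$ and $\bar{v}_t \notin \{u_1, \ldots, u_s\}^*$.

2. $\mathcal{H}^*_\alpha(w)$ is not regular if there are $1 \le s < m$ and $1 \le t < n$ such that
$u_s \notin \overline{\{v_1, \ldots, v_t\}}^*$ and $\bar{v}_t \notin \{u_1, \ldots, u_{m-1}\}^*$.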

Proof. Let $s$ and $t$ be the minimal indices that satisfy the conditions in
statement 1. Note that $s, t \ge 2$ and $u_s, \bar{v}_t \ne u_1$ as $u_1 = \bar{v}_1$. We will first argue why
the assumption $s < t$ is no restriction.

Let us consider the case where the conditions in statement 1 are satisfied,
but the conditions in statement 2 are not satisfied. It is easy to see that
$u_s \notin \overline{\{v_1, \ldots, v_t\}}^*$ is satisfied anyway. By contradiction assume $s \ge t$. Due
to Lemma 3.9,

$\bar{v}_t \notin \{u_1, \ldots, u_{m-1}\}^*$.

This satisfies the conditions of statement 2 and yields the contradiction. We
conclude $s < t$.

Now, let us consider the case where the conditions of both statements are
satisfied. Let $s$ and $t'$ be the minimal indices such that $u_s \notin \overline{\{v_1, \ldots, v_{n-1}\}}^*$
and $\bar{v}_{t'} \notin \{u_1, \ldots, u_{m-1}\}^*$. We may assume $s < t'$, by symmetry. Note that
$\bar{v}_{t'-1} \in \{u_1, \ldots, u_{m-1}\}^*$ by the minimality of $t'$. If $\bar{v}_{t'-1} \in \{u_1, \ldots, u_s\}^*$, then
we see that $s$ and $t = t'$ are the minimal indices that satisfy the conditions in
statement 1 and $s < t$. Otherwise, there is a factorization $\bar{v}_{t'-1} = x u_i y$ where
$x \in \{u_1, \ldots, u_s\}^*$, $s < i < m$, and $y \in \{u_1, \ldots, u_{m-1}\}^*$. Note that $s$ and
$t = \mathrm{ind}_\alpha(x) + s + 1$ (hence $\bar{v}_t = x u_{s+1}$) are the minimal indices that satisfy the
conditions of statement 1 and, obviously, $s < t$.

Observe that the minimality of $s$ yields $u_1, \ldots, u_{s-1} \in \overline{\{v_1, \ldots, v_{n-1}\}}^*$. If
$x \in \{u_1, \ldots, u_{s-1}, \bar{v}_1, \ldots, \bar{v}_{n-1}\}$ was a suffix of $u_s$, then $u_s = u_{s-\mathrm{ind}_\alpha(x)}\,x \in
\overline{\{v_1, \ldots, v_{n-1}\}}^*$, due to Lemma 3.8. Hence, none of these words is a suffix of $u_s$.
Symmetrically, none of the words $u_1, \ldots, u_s, \bar{v}_1, \ldots, \bar{v}_{t-1}$ is a suffix of $\bar{v}_t$. This
observation will become crucial later.

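We will now define a regular language $R$ and show that the intersection
$\mathcal{H}^*_\alpha(w) \cap R$ is not regular and, therefore, $\mathcal{H}^*_\alpha(w)$ is not regular either. We let

$R = u_s u_1^{\ge n}\, \bar{v}_t\, w\, \bar{u}_1^{\ge n}\, \bar{u}_s$

and we claim

$\mathcal{H}^*_\alpha(w) \cap R = \{u_s u_1^{\ell}\, \bar{v}_t\, w\, \bar{u}_1^{\ell}\, \bar{u}_s \mid \ell \ge n\} =: L$,

which is obviously not regular. Note that for every $\ell \ge n$

$w \to^*_{\mathcal{RH}} w\bar{u}_1^{\ell} \to_{\mathcal{RH}} w\bar{u}_1^{\ell}\bar{u}_s \to_{\mathcal{LH}} u_s u_1^{\ell}\, \bar{v}_t\, w\, \bar{u}_1^{\ell}\, \bar{u}_s$.

Hence, $\mathcal{H}^*_\alpha(w) \cap R \supseteq L$.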

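Let $z = u_s u_1^{\ell_1}\, \bar{v}_t\, w\, \bar{u}_1^{\ell_2}\, \bar{u}_s$ for some $\ell_1, \ell_2 \ge n$, which is in $R$. We assume
$z \in \mathcal{H}^*_\alpha(w)$ and prove that this assumption requires $\ell_1 = \ell_2$. Let $z'$ be the
right-most word in the derivation $w \to^*_{\mathcal{H}} z' \to^*_{\mathcal{H}} z$ such that $z' = xwy$ for some words
$x, y$ with $u_1^{\ell_1}\bar{v}_t \ge_s x$ and $y \le_p \bar{u}_1^{\ell_2}$; these conditions mean that $x$ or $y$ does
not overlap with the prefix $u_s$ or the suffix $\bar{u}_s$, respectively. By right-most we
mean that either $z' \to_{\mathcal{LH}} x'z' \to^*_{\mathcal{H}} z$ where $x'x >_s u_1^{\ell_1}\bar{v}_t$ or $z' \to_{\mathcal{RH}} z'y' \to^*_{\mathcal{H}} z$
where $\bar{u}_1^{\ell_2} <_p yy'$; this means $x'$ or $y'$ overlaps with the prefix $u_s$ or the suffix
$\bar{u}_s$, respectively. Obviously, $y \in \bar{u}_1^*$. Note that if $x \ne \varepsilon$, then $x$ cannot be a
proper suffix of $\bar{v}_t$; otherwise a word from $u_1 = \bar{v}_1, \ldots, \bar{v}_{t-1}$ would be a suffix of
$\bar{v}_t$ which was excluded. Hence, $x = \varepsilon$ or $x \in u_1^*\bar{v}_t$.

First consider the case $z' \to_{\mathcal{LH}} x'z' \to^*_{\mathcal{H}} z$ where $x'x >_s u_1^{\ell_1}\bar{v}_t$. We show that
this case cannot occur. Let $u' \ne \varepsilon$ be the suffix of $u_s$ such that $u'u_1^{\ell_1}\bar{v}_t = x'x$.
As $x' \in u_1^*\{\bar{v}_1, \ldots, \bar{v}_{n-1}\}$ and $u'\alpha \le_p x'\alpha$, some word from $u_1 = \bar{v}_1, \ldots, \bar{v}_{n-1}$
would be a suffix of $u_s$.

Now consider the case $z' \to_{\mathcal{RH}} z'y' \to^*_{\mathcal{H}} z$ where $\bar{u}_1^{\ell_2} <_p yy'$. Again, let
$u' \ne \varepsilon$ be the suffix of $u_s$ such that $\bar{u}_1^{\ell_2}\bar{u'} = yy'$. As $u'\alpha$ is a prefix of $x u_{m-1}\alpha$
and none of the words $\bar{v}_1, \ldots, \bar{v}_t, u_1, \ldots, u_{s-1}$ is a suffix of $u_s$, we see that
$u' = u_s$ and $x = \varepsilon$.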

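Thus, in order to generate $z$ from $w$ by iterated hairpin completion, the
derivation process must be of the form

$w \to^*_{\mathcal{RH}} w\bar{u}_1^{\ell_2}\bar{u}_s \to^*_{\mathcal{H}} u_s u_1^{\ell_1}\, \bar{v}_t\, w\, \bar{u}_1^{\ell_2}\, \bar{u}_s = z$.

Let $x$ be a (newly chosen) word such that $w\bar{u}_1^{\ell_2}\bar{u}_s \to_{\mathcal{LH}} x w\bar{u}_1^{\ell_2}\bar{u}_s$ is the
first left hairpin completion in the derivation above. Therefore, $x\alpha$ is a prefix
of $u_s u_1^{\ell_2}\bar{v}_{n-1}\alpha$ and $x$ is a suffix of $u_s u_1^{\ell_1}\bar{v}_t$. In particular, every suffix $y$ of $x$
with $\mathrm{ind}_\alpha(y) \le t$ is a suffix of $\bar{v}_t$. Recall that $\mathrm{ind}_\alpha(u_s) = s < t$. If $x\alpha$ was a
prefix of $u_s\alpha$, then some word from $u_1, \ldots, u_s$ would be a suffix of $\bar{v}_t$ which is
impossible. Verify that $x \in u_s u_1^+$ and $x = u_s u_1^{\ell_2}\bar{v}_j$ with $1 \le j < t$ would also
impose a forbidden suffix for $\bar{v}_t$. Thus, we see that $x = u_s u_1^{\ell_2}\bar{v}_j$ with $t \le j < n$.
The case $j > t$ is not possible as it implies $\bar{v}_t\alpha \le_p \bar{v}_j\alpha = u_1^{\ell_1-\ell_2}\bar{v}_t\alpha$ and a word
from $u_1 = \bar{v}_1, \ldots, \bar{v}_{t-1}$ would be a suffix of $\bar{v}_t$. Therefore, $x = u_s u_1^{\ell_2}\bar{v}_t$ and since
$x$ is a suffix of $u_s u_1^{\ell_1}\bar{v}_t$ and $u_1$ is not a suffix of $u_s$ we deduce $u_s u_1^{\ell_1}\bar{v}_t = u_s u_1^{\ell_2}\bar{v}_t$.

Consequently, $z \in \mathcal{H}^*_\alpha(w)$ if and only if $\ell_1 = \ell_2$. This completes the proof
of $\mathcal{H}^*_\alpha(w) \cap R = L$. □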

Combining Theorems 4.7 and 4.8, we conclude: 

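Corollary 4.9. Let $w \in \alpha\Sigma^* \cap \Sigma^*\bar{\alpha}$ be a non-crossing $(m,n)$-word with $m, n \ge 2$,
let $\mathcal{P}_\alpha(w) = \{u_0, \ldots, u_{m-1}\}$ such that $u_0 <_p \cdots <_p u_{m-1}$, let $\mathcal{S}_{\bar{\alpha}}(w) =
\{v_0, \ldots, v_{n-1}\}$ such that $v_{n-1} >_s \cdots >_s v_0$, and let $u_1 = \bar{v}_1$.
$\mathcal{H}^*_\alpha(w)$ is regular if and only if

1. for all $1 \le s < m$ we have $u_s \in \overline{\mathcal{S}_{\bar{\alpha}}(w)}^*$ or $\overline{\mathcal{S}_{\bar{\alpha}}(w)} \subseteq \{u_1, \ldots, u_s\}^*$ and

2. for all $1 \le t < n$ we have $\bar{v}_t \in \mathcal{P}_\alpha(w)^*$ or $\mathcal{P}_\alpha(w) \subseteq \{\bar{v}_1, \ldots, \bar{v}_t\}^*$.

Proof. Theorem 4.7 provides the if-part.

For the only-if-part, assume that condition 1 is breached. There exists $1 \le
s < m$ such that $u_s \notin \overline{\mathcal{S}_{\bar{\alpha}}(w)}^*$ and $\overline{\mathcal{S}_{\bar{\alpha}}(w)} \setminus \{u_1, \ldots, u_s\}^* \ne \emptyset$. Let $1 \le t < n$
such that $\bar{v}_t \in \overline{\mathcal{S}_{\bar{\alpha}}(w)} \setminus \{u_1, \ldots, u_s\}^*$. We see that $s, t$ are indices satisfying
the conditions of statement 1 in Theorem 4.8 and, therefore, $\mathcal{H}^*_\alpha(w)$ is not
regular. □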

Thus, we provided a necessary and sufficient condition for the regularity of the
iterated hairpin completion of a non-crossing $(m,n)$-word. As one can easily
observe, this condition is decidable.

Corollary 4.10. For a given non-$\alpha$-crossing word $w$, it is decidable whether or
not its iterated hairpin completion, $\mathcal{H}^*_\alpha(w)$, is regular.
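
To make the decidability claim concrete, the following sketch tests the two conditions of Corollary 4.9 for a given word. It is an illustration under stated assumptions, not the authors' procedure: the DNA alphabet is used, all function names are hypothetical, and the caller is assumed to have already ensured the hypotheses of the corollary (the word is non-crossing, $m, n \ge 2$, and $u_1 = \bar{v}_1$). The reductions of Theorems 4.1 and 4.6 for the remaining cases are not implemented. Membership in $X^*$ for a finite set $X$ is decided by a standard dynamic program over prefixes.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(word):
    """Reverse the word and complement each letter."""
    return "".join(COMPLEMENT[a] for a in reversed(word))


def alpha_prefixes(w, alpha):
    """u_0, ..., u_{m-1}, ordered by length."""
    return [w[:p] for p in range(len(w) - len(alpha) + 1) if w.startswith(alpha, p)]


def abar_suffixes(w, alpha):
    """v_0, ..., v_{n-1}, ordered by length."""
    k, abar = len(alpha), complement(alpha)
    suffixes = [w[p + k:] for p in range(len(w) - k + 1) if w.startswith(abar, p)]
    return sorted(suffixes, key=len)


def in_star(x, generators):
    """Decide whether x is a concatenation of words from the finite set generators."""
    gens = [g for g in generators if g]
    ok = [True] + [False] * len(x)  # ok[i]: the prefix x[:i] lies in generators*
    for i in range(1, len(x) + 1):
        ok[i] = any(len(g) <= i and ok[i - len(g)] and x.endswith(g, 0, i)
                    for g in gens)
    return ok[len(x)]


def satisfies_corollary_4_9(w, alpha):
    """Check conditions 1 and 2 of Corollary 4.9 (hypotheses assumed to hold)."""
    us = alpha_prefixes(w, alpha)
    vbars = [complement(v) for v in abar_suffixes(w, alpha)]
    cond1 = all(in_star(us[s], vbars)
                or all(in_star(vb, us[:s + 1]) for vb in vbars)
                for s in range(1, len(us)))
    cond2 = all(in_star(vbars[t], us)
                or all(in_star(u, vbars[:t + 1]) for u in us)
                for t in range(1, len(vbars)))
    return cond1 and cond2


if __name__ == "__main__":
    print(satisfies_corollary_4_9("ACACTGTGT", "AC"))
    print(satisfies_corollary_4_9("ACTACCACTGTTGTAGT", "AC"))
```

For the second sample word, the $\alpha$-prefix `ACTACC` is not a product of complemented $\bar{\alpha}$-suffixes, and the complemented suffix `ACTACA` is not a product of $\alpha$-prefixes, so condition 1 fails at $s = 2$.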

Furthermore, we can derive from the proof of Theorem 4.8 that if the iterated
hairpin completion of $w$ is not regular, then the intersection of $\mathcal{H}^*_\alpha(w)$ with
$R = u_s u_1^{\ge n}\, \bar{v}_t\, w\, \bar{u}_1^{\ge n}\, \bar{u}_s$ is not a regular language (for suitable $s, t$ and in case
$u_1 = \bar{v}_1$). More precisely, we obtained the context-free language

$\mathcal{H}^*_\alpha(w) \cap R = \{u_s u_1^{\ell}\, \bar{v}_t\, w\, \bar{u}_1^{\ell}\, \bar{u}_s \mid \ell \ge n\}$.

Consider intersecting $\mathcal{H}^*_\alpha(w)$ with $R' = (u_s u_1^{\ge n}\, \bar{v}_t)^2\, w\, \bar{u}_1^{\ge n}\, \bar{u}_s$. Using the
same arguments as we did within the proof of Theorem 4.8, we can show that
the resulting intersection is a non-context-free language. Using this idea we can
prove that if $\mathcal{H}^*_\alpha(w)$ is not regular, it is not context-free either. The details of
this proof are left for the interested reader.

Corollary 4.11. Let $w$ be a non-$\alpha$-crossing word. If its iterated hairpin
completion $\mathcal{H}^*_\alpha(w)$ is not regular, then $\mathcal{H}^*_\alpha(w)$ is not context-free.

Final Remarks 

We proved that regularity of the iterated hairpin completion of a given
non-crossing word is decidable. The general case, including that of crossing
words, remains to be explored.